Tests overhaul#143

Merged
crusaderky merged 6 commits into
explosion:main from
crusaderky:tests-overhaul
Dec 8, 2025
@crusaderky crusaderky commented Dec 8, 2025

This PR thoroughly revisits the test suite. It's made of several commits, each with individual comments:

  • Tweak test tolerances
    • Tighten tolerance for float64 from 1e-3 ~ 1e-4 to 1e-9.
    • Relax tolerance for float32 to 1e-2, as increasing the number of valid examples led to the discovery of #142 (dotv: float32 rounding error is 10x of NumPy).
    • Swap actual <-> desired in assert_almost_equal, as the two are not symmetric.
  • Tweak strategies ranges
    • Add test coverage for arrays of size 1.
    • Remove unused, confusing default parameters from custom strategies.
  • Add central control for number of examples tested
    This is meant to be adjusted locally for more thorough (slower) test runs.
  • New test for dotv invalid use case
    Mirrors the same test in test_gemm.py.
  • Overhaul hypothesis strategies
    • Improve readability.
    • Given infinite hypothesis examples, this commit does not introduce any functional changes.
      However, given a fixed and relatively low max_examples setting, it substantially increases test coverage:
      1. It avoids generating examples that were previously discarded by assume(). According to pytest --hypothesis-show-statistics, these were ~10% of the examples for dotv and ~20% for gemm.
      2. It avoids examples that ended up as duplicates after trimming. For example, in dotv, hypothesis could previously generate both A=[1,2,3,4], B=[5,6] and A=[1,2], B=[5,6,7,8]; both would get trimmed to A=[1,2], B=[5,6].
      3. In gemm, it removes an entire degree of freedom by dropping the unused variable out_col, which previously also resulted in duplicate examples.
  • Add threading tests for shared input
    Given input arrays that are shared between multiple threads, test that you can run dotv and gemm on them in parallel from multiple threads.
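
The shape of a shared-input threading test can be sketched roughly as follows. This is only an illustration, not the test added in this PR: `dot` is a hypothetical pure-Python stand-in for dotv, and the thread and call counts are arbitrary assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def dot(a, b):
    """Hypothetical stand-in for dotv: only reads its inputs."""
    return sum(x * y for x, y in zip(a, b))


def run_shared(a, b, n_threads=4, n_calls=32):
    """Call dot() on the same two inputs from several threads at once."""
    expected = dot(a, b)  # single-threaded reference result
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(dot, a, b) for _ in range(n_calls)]
        results = [f.result() for f in futures]
    return expected, results


# The same input objects are shared by every worker thread.
a = (1.0, 2.0, 3.0)
b = (4.0, 5.0, 6.0)
expected, results = run_shared(a, b)
assert all(r == expected for r in results)
```

Every call receives the same input objects, so any result that differs from the single-threaded reference would point at the function mutating or mishandling shared inputs.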

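The duplicate-example effect described in point 2 can be illustrated with a small counting exercise on vector lengths alone. This sketch does not use hypothesis and does not reproduce the actual strategies; `MAX_LEN` and both helper functions are hypothetical, and drawing a single shared length is just one assumed way of avoiding the trimming.

```python
from itertools import product

MAX_LEN = 4  # hypothetical upper bound on vector length


def trimmed_lengths(max_len):
    """Draw two lengths independently, then trim both vectors to the shorter."""
    pairs = product(range(1, max_len + 1), repeat=2)
    return [min(len_a, len_b) for len_a, len_b in pairs]


def shared_lengths(max_len):
    """Draw one length and use it for both vectors, so nothing is trimmed."""
    return list(range(1, max_len + 1))


old = trimmed_lengths(MAX_LEN)
new = shared_lengths(MAX_LEN)

# Number of draws versus number of distinct effective lengths.
print(len(old), len(set(old)))
print(len(new), len(set(new)))
```

With independent lengths, many draws collapse to the same effective length after trimming, so a fixed max_examples budget is partly spent on repeats. With a shared length, every draw is distinct.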
@crusaderky crusaderky marked this pull request as ready for review December 8, 2025 18:06
# Copyright ExplosionAI GmbH, released under BSD.
import numpy as np

np.random.seed(0)

This call did nothing; see https://hypothesis.readthedocs.io/en/latest/reference/strategies.html#hypothesis.strategies.random_module

Hypothesis always seeds global PRNGs before running a test, and restores the previous state afterwards.

@crusaderky crusaderky merged commit 14781ae into explosion:main Dec 8, 2025
64 checks passed
@crusaderky crusaderky deleted the tests-overhaul branch December 8, 2025 21:17